A less-greedy two-term Tsallis Entropy Information Metric approach for decision tree classification
Authors
Abstract
The construction of efficient and effective decision trees remains a key topic in machine learning because of their simplicity and flexibility. Many heuristic algorithms have been proposed to construct near-optimal decision trees. Most of them, however, are greedy algorithms that have the drawback of obtaining only local optima. Besides, the conventional split criteria they use, e.g. Shannon entropy, Gain Ratio and Gini index, are based on a single term and lack adaptability to different datasets. To address these issues, we propose a less-greedy two-term Tsallis Entropy Information Metric (TEIM) algorithm with a new split criterion and a new construction method for decision trees. Firstly, the new split criterion is based on two-term Tsallis conditional entropy, which is better than conventional one-term split criteria. Secondly, the new tree construction is based on a two-stage approach that reduces the greediness and avoids local optima to a certain extent. The TEIM algorithm takes advantage of the generalization ability of two-term Tsallis entropy and the low greediness of the two-stage approach. Experimental results on UCI datasets indicate that, compared with state-of-the-art decision tree algorithms, the TEIM algorithm yields statistically significantly better decision trees and is more robust to noise. © 2016 Elsevier B.V. All rights reserved.
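As background for the split criteria named in the abstract, the following is a minimal sketch of the standard Tsallis entropy, S_q = (1 - sum_i p_i^q) / (q - 1), which reduces to Shannon entropy as q approaches 1 and equals the Gini index at q = 2. It is not the two-term TEIM criterion, which the abstract does not define. The function names `tsallis_entropy` and `weighted_child_entropy` are hypothetical, and the weighted average used here is a plain one-term impurity chosen only for illustration.

```python
import math
from collections import Counter


def tsallis_entropy(labels, q):
    """Standard Tsallis entropy of a label list.

    S_q = (1 - sum_i p_i^q) / (q - 1); at q = 1 the Shannon limit (in nats) is used.
    """
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    if math.isclose(q, 1.0):
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def weighted_child_entropy(feature_values, labels, q):
    """Illustrative one-term impurity after splitting on a categorical feature.

    Assumption: child entropies are averaged with weights |child| / |parent|.
    This is NOT the paper's two-term criterion.
    """
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(g) / n * tsallis_entropy(g, q) for g in groups.values())


if __name__ == "__main__":
    y = ["a", "a", "b", "b"]
    print(tsallis_entropy(y, 2))  # Gini index of a balanced binary node
    print(tsallis_entropy(y, 1))  # Shannon entropy in nats
    print(weighted_child_entropy(["x", "x", "y", "y"], y, 2))  # a perfect split
```

Varying q changes how strongly the measure penalizes impure nodes, which is the kind of adaptability to different datasets that the abstract says fixed criteria lack.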
Similar resources
Unifying Decision Trees Split Criteria Using Tsallis Entropy
The construction of efficient and effective decision trees remains a key topic in machine learning because of their simplicity and flexibility. A lot of heuristic algorithms have been proposed to construct near-optimal decision trees. Most of them, however, are greedy algorithms which have the drawback of obtaining only local optimums. Besides, common split criteria, e.g. Shannon entropy, Gain ...
Plant Classification in Images of Natural Scenes Using Segmentations Fusion
This paper presents a novel approach to automatically classifying and identifying tree leaves using image segmentation fusion. With the development of mobile devices and remote access, automatic plant identification in images taken in natural scenes has received much attention. Image segmentation plays a key role in most plant identification methods, especially in complex background images. Wher...
Comparison of Shannon, Renyi and Tsallis Entropy Used in Decision Trees
Shannon entropy used in standard top-down decision trees does not guarantee the best generalization. Split criteria based on generalized entropies offer different compromises between purity of nodes and overall information gain. Modified C4.5 decision trees based on Tsallis and Renyi entropies have been tested on several high-dimensional microarray datasets with interesting results. This approac...
Tsallis Entropy and Conditional Tsallis Entropy of Fuzzy Partitions
The purpose of this study is to define the concepts of Tsallis entropy and conditional Tsallis entropy of fuzzy partitions and to obtain some results concerning this kind of entropy. We show that the Tsallis entropy of fuzzy partitions has the subadditivity and concavity properties. We study this information measure under the refinement and zero mode subset relations. We check the chain rules for ...
Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking and its danger is ever increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...
Journal title:
- Knowl.-Based Syst.
Volume 120, Issue
Pages -
Publication date: 2017